20 research outputs found

Generalised Pattern Matching Revisited

Author: Dudek Bartłomiej
Gawrychowski Paweł
Starikovskaya Tatiana
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)
Publication date: 01/01/2020

In the problem of Generalised Pattern Matching (GPM) [STOC'94, Muthukrishnan and Palem], we are given a text $T$ of length $n$ over an alphabet $\Sigma_T$, a pattern $P$ of length $m$ over an alphabet $\Sigma_P$, and a matching relationship $\subseteq \Sigma_T \times \Sigma_P$, and must return all substrings of $T$ that match $P$ (reporting) or the number of mismatches between each substring of $T$ of length $m$ and $P$ (counting). In this work, we improve over all previously known algorithms for this problem for various parameters describing the input instance:

* $\mathcal{D}$ being the maximum number of characters that match a fixed character,
* $\mathcal{S}$ being the number of pairs of matching characters,
* $\mathcal{I}$ being the total number of disjoint intervals of characters that match the $m$ characters of the pattern $P$.

At the heart of our new deterministic upper bounds for $\mathcal{D}$ and $\mathcal{S}$ lies a faster construction of superimposed codes, which solves an open problem posed in [FOCS'97, Indyk] and can be of independent interest. To conclude, we demonstrate first lower bounds for GPM. We start by showing that any deterministic or Monte Carlo algorithm for GPM must use $\Omega(\mathcal{S})$ time, and then proceed to show higher lower bounds for combinatorial algorithms. These bounds show that our algorithms are almost optimal, unless a radically new approach is developed.
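To make the reporting and counting variants of the problem statement concrete, here is a naive baseline sketch. It is not the algorithm from the paper: it simply checks every length-$m$ window in $O(nm)$ time. The function names and the representation of the matching relationship as a Python set of (text character, pattern character) pairs are assumptions made for illustration.

```python
def gpm_count_mismatches(text, pattern, matches):
    """Counting variant (naive): for every length-m window of text, count the
    positions i where (text[j + i], pattern[i]) is NOT in the relation."""
    n, m = len(text), len(pattern)
    return [
        sum((text[j + i], pattern[i]) not in matches for i in range(m))
        for j in range(n - m + 1)
    ]


def gpm_report(text, pattern, matches):
    """Reporting variant (naive): start positions of windows with 0 mismatches."""
    counts = gpm_count_mismatches(text, pattern, matches)
    return [j for j, mismatches in enumerate(counts) if mismatches == 0]


# Hypothetical instance: text over {a, b}, pattern over {x, y}.
relation = {("a", "x"), ("b", "y")}
counts = gpm_count_mismatches("abab", "xy", relation)
positions = gpm_report("abab", "xy", relation)
```

Here `len(relation)` plays the role of the parameter $\mathcal{S}$ from the abstract.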

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

A Faster Subquadratic Algorithm for the Longest Common Increasing Subsequence Problem

Author: Agrawal Anadi
Gawrychowski Paweł
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Algorithms and Computation (ISAAC 2020)
Publication date: 01/01/2020

The Longest Common Increasing Subsequence (LCIS) is a variant of the classical Longest Common Subsequence (LCS), in which we additionally require the common subsequence to be strictly increasing. While the well-known "Four Russians" technique can be used to find LCS in subquadratic time, it does not seem applicable to LCIS. Recently, Duraj [STACS 2020] used a completely different method based on the combinatorial properties of LCIS to design an $\mathcal{O}(n^2(\log\log n)^2/\log^{1/6}n)$ time algorithm. We show that an approach based on exploiting tabulation can be used to construct an asymptotically faster $\mathcal{O}(n^2 \log\log n/\sqrt{\log n})$ time algorithm. As our solution avoids using the specific combinatorial properties of LCIS, it can be also adapted for the Longest Common Weakly Increasing Subsequence (LCWIS).
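For orientation, the quadratic baseline that the subquadratic results improve on can be sketched as follows. This is the textbook $O(n^2)$ dynamic program for LCIS, not the tabulation-based algorithm of the paper; the function name `lcis_length` is an assumption made for illustration.

```python
def lcis_length(a, b):
    """Length of the longest common strictly increasing subsequence of a and b.

    dp[j] holds the length of the best common increasing subsequence that
    ends exactly at b[j], using the elements of a processed so far.
    """
    dp = [0] * len(b)
    for x in a:
        best = 0  # max dp[k] over earlier positions k with b[k] < x
        for j, y in enumerate(b):
            if y == x:
                dp[j] = max(dp[j], best + 1)
            elif y < x:
                best = max(best, dp[j])
    return max(dp, default=0)


example = lcis_length([3, 4, 9, 1], [5, 3, 8, 9, 10, 2, 1])
```

In this sketch, replacing the test `y < x` with `y <= x` and handling equal elements accordingly would be the starting point for the weakly increasing variant (LCWIS), which needs a little more care than shown here.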

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Value Iteration Using Universal Graphs and the Complexity of Mean Payoff Games

Author: Gawrychowski Paweł
Ohlmann Pierre
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020)
Publication date: 01/01/2020

We study the computational complexity of solving mean payoff games. This class of games can be seen as an extension of parity games, and they have similar complexity status: in both cases solving them is in NP ∩ coNP and not known to be in P. In a breakthrough result Calude, Jain, Khoussainov, Li, and Stephan constructed in 2017 a quasipolynomial time algorithm for solving parity games, which was quickly followed by a few other algorithms with the same complexity. Our objective is to investigate how these techniques can be extended to mean payoff games. The starting point is the combinatorial notion of universal trees: all quasipolynomial time algorithms for parity games have been shown to exploit universal trees. Universal graphs extend universal trees to arbitrary (positionally determined) objectives. We show that they yield a family of value iteration algorithms for solving mean payoff games which includes the value iteration algorithm due to Brim, Chaloupka, Doyen, Gentilini, and Raskin. The contribution of this paper is to prove tight bounds on the complexity of algorithms for mean payoff games using universal graphs. We consider two parameters: the largest weight $N$ in absolute value and the number $k$ of weights. The dependence in $N$ in the existing value iteration algorithm is linear; we show that this can be improved to $N^{1 - 1/n}$ and obtain a matching lower bound. However, we show that we cannot break the linear dependence in the exponent in the number $k$ of weights, implying that universal graphs do not yield a quasipolynomial time algorithm for solving mean payoff games.

Dagstuhl Research Online Publication Server

Finding the KT Partition of a Weighted Graph in Near-Linear Time

Author: Apers Simon
Gawrychowski Paweł
Lee Troy
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)
Publication date: 01/01/2022

Dagstuhl Research Online Publication Server

Counting 4-Patterns in Permutations Is Equivalent to Counting 4-Cycles in Graphs

Author: Dudek Bartłomiej
Gawrychowski Paweł
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Algorithms and Computation (ISAAC 2020)
Publication date: 01/01/2020

Permutation $\sigma$ appears in permutation $\pi$ if there exists a subsequence of $\pi$ that is order-isomorphic to $\sigma$. The natural algorithmic question is to check if $\sigma$ appears in $\pi$, and if so count the number of occurrences. Only since very recently we know that for any fixed length $k$, we can check if a given pattern of length $k$ appears in a permutation of length $n$ in time linear in $n$, but being able to count all such occurrences in $f(k) \cdot n^{o(k/\log k)}$ time would refute the exponential time hypothesis (ETH). Together with practical applications in statistics, this motivates a systematic study of the complexity of counting occurrences for different patterns of fixed small length $k$. We investigate this question for $k = 4$. Very recently, Even-Zohar and Leng [arXiv 2019] identified two types of 4-patterns. For the first type they designed an $\tilde{\mathcal{O}}(n)$ time algorithm, while for the second they were able to provide an $\tilde{\mathcal{O}}(n^{1.5})$ time algorithm. This brings up the question whether the permutations of the second type are inherently harder than the first type. We establish a connection between counting 4-patterns of the second type and counting 4-cycles (not necessarily induced) in a sparse undirected graph. By designing two-way reductions we show that the complexities of both problems are the same, up to polylogarithmic factors. This allows us to leverage the work done on the latter to provide a reasonable argument for why there is a difference in the complexities for counting 4-patterns of the first and the second type. In particular, even for the seemingly simpler problem of detecting a 4-cycle in a graph on $m$ edges, the best known algorithm works in $\mathcal{O}(m^{4/3})$ time. Our reductions imply that an $\mathcal{O}(n^{4/3-\varepsilon})$ time algorithm for counting occurrences of any 4-pattern of the second type in a permutation of length $n$ would imply an exciting breakthrough for counting (and hence also detecting) 4-cycles. In the other direction, by plugging in the fastest known algorithm for counting 4-cycles, we obtain an algorithm for counting occurrences of any 4-pattern of the second type in $\mathcal{O}(n^{1.48})$ time.
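The definition of an occurrence (a subsequence that is order-isomorphic to the pattern) can be illustrated with a brute-force counter. This sketch enumerates all $\binom{n}{k}$ subsequences, so it is far slower than any of the algorithms discussed in the abstract and only serves to pin down what is being counted. The names `rank_shape` and `count_occurrences` are assumptions made for illustration.

```python
from itertools import combinations


def rank_shape(seq):
    """Replace each element by its rank, so order-isomorphic sequences
    (with distinct elements) map to the same tuple."""
    order = sorted(range(len(seq)), key=seq.__getitem__)
    ranks = [0] * len(seq)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return tuple(ranks)


def count_occurrences(pattern, permutation):
    """Count subsequences of permutation that are order-isomorphic to pattern.

    itertools.combinations keeps elements in their original positional order,
    so each tuple it yields is a subsequence.
    """
    target = rank_shape(pattern)
    k = len(pattern)
    return sum(rank_shape(sub) == target for sub in combinations(permutation, k))


increasing_pairs = count_occurrences([1, 2], [1, 3, 2])
```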

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Matching Patterns with Variables Under Hamming Distance

Author: Gawrychowski Paweł
Manea Florin
Siemer Stefan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 46th International Symposium on Mathematical Foundations of Computer Science (MFCS 2021)
Publication date: 01/01/2021

A pattern $\alpha$ is a string of variables and terminal letters. We say that $\alpha$ matches a word $w$, consisting only of terminal letters, if $w$ can be obtained by replacing the variables of $\alpha$ by terminal words. The matching problem, i.e., deciding whether a given pattern matches a given word, was heavily investigated: it is NP-complete in general, but can be solved efficiently for classes of patterns with restricted structure. In this paper, we approach this problem in a generalized setting, by considering approximate pattern matching under Hamming distance. More precisely, we are interested in what is the minimum Hamming distance between $w$ and any word $u$ obtained by replacing the variables of $\alpha$ by terminal words. Firstly, we address the class of regular patterns (in which no variable occurs twice) and propose efficient algorithms for this problem, as well as matching conditional lower bounds. We show that the problem can still be solved efficiently if we allow repeated variables, but restrict the way the different variables can be interleaved according to a locality parameter. However, as soon as we allow a variable to occur more than once and its occurrences can be interleaved arbitrarily with those of other variables, even if none of them occurs more than once, the problem becomes intractable.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Order-Preserving Squares in Strings

Author: Gawrychowski Paweł
Ghazawi Samah
Landau Gad M.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Publication date: 01/01/2023

Dagstuhl Research Online Publication Server

Dynamic Longest Common Substring in Polylogarithmic Time

Author: Charalampopoulos Panagiotis
Gawrychowski Paweł
Pokorski Karol
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)
Publication date: 01/01/2020

The longest common substring problem consists in finding a longest string that appears as a (contiguous) substring of two input strings. We consider the dynamic variant of this problem, in which we are to maintain two dynamic strings $S$ and $T$, each of length at most $n$, that undergo substitutions of letters, in order to be able to return a longest common substring after each substitution. Recently, Amir et al. [ESA 2019] presented a solution for this problem that needs only $\tilde{\mathcal{O}}(n^{2/3})$ time per update. This brought the challenge of determining whether there exists a faster solution with polylogarithmic update time, or (as is the case for other dynamic problems), we should expect a polynomial (conditional) lower bound. We answer this question by designing a significantly faster algorithm that processes each substitution in amortized $\log^{\mathcal{O}(1)} n$ time with high probability. Our solution relies on exploiting the local consistency of the parsing of a collection of dynamic strings due to Gawrychowski et al. [SODA 2018], and on maintaining two dynamic trees with labeled bicolored leaves, so that after each update we can report a pair of nodes, one from each tree, of maximum combined weight, which have at least one common leaf-descendant of each color. We complement this with a lower bound of $\Omega(\log n/ \log\log n)$ for the update time of any polynomial-size data structure that maintains the LCS of two dynamic strings, even allowing amortization and randomization.
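As a point of reference for the static problem, the following sketch computes a longest common substring with the textbook $O(|S| \cdot |T|)$ dynamic program. It is not the dynamic data structure from the paper: after a substitution, this baseline would simply be rerun from scratch. The function name `longest_common_substring` is an assumption made for illustration.

```python
def longest_common_substring(s, t):
    """Return one longest string occurring contiguously in both s and t.

    cur[j] is the length of the longest common suffix of s[:i] and t[:j];
    the maximum over all (i, j) is the answer's length.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


result = longest_common_substring("abcde", "xbcdy")
```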

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

On Two Measures of Distance Between Fully-Labelled Trees

Author: Bernardini Giulia
Bonizzoni Paola
Gawrychowski Paweł
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)
Publication date: 01/01/2020

The last decade brought a significant increase in the amount of data and a variety of new inference methods for reconstructing the detailed evolutionary history of various cancers. This brings the need of designing efficient procedures for comparing rooted trees representing the evolution of mutations in tumor phylogenies. Bernardini et al. [CPM 2019] recently introduced a notion of the rearrangement distance for fully-labelled trees motivated by this necessity. This notion originates from two operations: one that permutes the labels of the nodes, the other that affects the topology of the tree. Each operation alone defines a distance that can be computed in polynomial time, while the actual rearrangement distance, that combines the two, was proven to be NP-hard. We answer two open questions left unanswered by the previous work. First, what is the complexity of computing the permutation distance? Second, is there a constant-factor approximation algorithm for estimating the rearrangement distance between two arbitrary trees? We answer the first one by showing, via a two-way reduction, that calculating the permutation distance between two trees on $n$ nodes is equivalent, up to polylogarithmic factors, to finding the largest cardinality matching in a sparse bipartite graph. In particular, by plugging in the algorithm of Liu and Sidford [ArXiv 2020], we obtain an $\tilde{\mathcal{O}}(n^{4/3+o(1)})$ time algorithm for computing the permutation distance between two trees on $n$ nodes. Then we answer the second question positively, and design a linear-time constant-factor approximation algorithm that does not need any assumption on the trees.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Trieste

Dagstuhl Research Online Publication Server

Efficient Labeling for Reachability in Directed Acyclic Graphs

Author: Dulęba Maciej
Gawrychowski Paweł
Janczewski Wojciech
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Algorithms and Computation (ISAAC 2020)
Publication date: 01/01/2020

Dagstuhl Research Online Publication Server